Noninvasive Models to Assess Liver Inflammation and Fibrosis in Chronic HBV Infected Patients with Normal or Mildly Elevated Alanine Transaminase Levels: Which One Is Most Suitable?

The prevalence of substantial inflammation or fibrosis in treatment-naïve patients with chronic hepatitis B (CHB) and normal alanine transaminase (ALT) levels is high. A retrospective analysis was conducted on 599 consecutive patients with hepatitis B virus infection, who underwent liver biopsy, to investigate the value of noninvasive models based on routine serum markers for evaluating liver histology in CHB patients with normal or mildly elevated ALT levels and to provide treatment guidance. After comparing 55 models, we identified the top three models that exhibited excellent performance. Based on the area under the receiver operating characteristic curve (AUROC), the APGA model demonstrated a superior ability to evaluate significant fibrosis (AUROC = 0.750) and advanced fibrosis (AUROC = 0.832) and performed well in assessing liver inflammation (AUROCs = 0.779 and 0.874 for grades G ≥ 2 and G ≥ 3, respectively). APGA also exhibited significant correlations with liver inflammation and fibrosis stage (correlation coefficients, 0.452 and 0.405, respectively; p < 0.001). When the patients were stratified into groups based on HBeAg status and ALT level, APGA consistently outperformed the other 54 models. The other top two models, GAPI and XIE, also outperformed models based on other chronic hepatitis diseases. APGA may be the most suitable option for detecting liver fibrosis and inflammation in Chinese patients with CHB.


Introduction
Hepatitis B virus (HBV) infection continues to pose a global public health challenge, despite the decreasing trend in HBV infection rates over the years due to widespread access to hepatitis B vaccination and preventive measures. According to the World Health Organization (WHO), an estimated 296 million individuals were living with chronic hepatitis B (CHB) in 2019, with approximately 1.5 million new infections occurring annually. Moreover, there were an estimated 820,000 deaths primarily caused by cirrhosis and hepatocellular carcinoma (HCC), with a significant proportion of these fatalities occurring in the Asia-Pacific region [1].
Previous studies suggested that patients in the immune-tolerant (IT) phase have slow disease progression due to little inflammation or fibrosis in the liver [2,3]. However, numerous studies have shown that one-third to one-half of treatment-naïve CHB patients with normal alanine transaminase (ALT) levels may still experience substantial inflammation or fibrosis and may even have increased risks of HCC and death/transplantation [4]. Consequently, these patients require antiviral therapy, which has exhibited comparable efficacy [5,6]. Based on these findings, the indications for antiviral therapy have recently been expanded [7,8], aiming to prevent unnecessary deaths through early intervention in selected IT-phase patients [4]. Nevertheless, IT-phase patients continue to exhibit poor rates of seroconversion after receiving antiviral therapy and are more prone to developing treatment resistance [9]. Therefore, determining the state of liver histology at the commencement of treatment is crucial [3].
Although histological examination has long been regarded as the gold standard for assessing liver inflammation and fibrosis, the risks and costs associated with liver biopsy have limited its widespread use. Individuals with normal ALT levels are often hesitant to undergo this invasive procedure. Consequently, noninvasive methods for predicting liver fibrosis, including noninvasive models, transient elastography (TE), two-dimensional shear wave elastography (2D-SWE), and the FibroTest, have been proposed over the past few decades. Noninvasive and repeatable assessments of liver fibrosis are widely facilitated by TE, FibroTest, and 2D-SWE [10–12]. However, the diagnostic accuracy of these tests may be influenced by the operator's expertise, the absence of extensively validated cutoff values for specific stages of fibrosis, increasing rates of unreliability in patients with higher obesity, and the high cost of the equipment. These factors could limit the clinical utility of these tests in primary hospitals [13,14]. Given these findings, numerous noninvasive models based on serum markers, including the widely recommended Fibrosis-4 (FIB-4) and aminotransferase-to-platelet ratio index (APRI), have been developed and extensively discussed in the past two decades [15–19]. Furthermore, a majority of the serum-based noninvasive models were developed primarily using patients with chronic hepatitis C (CHC), while only a few were established using CHB cohorts. Additionally, other than FIB-4 and APRI, which are endorsed by guidelines or general consensus, no models have demonstrated satisfactory outcomes in clinical practice. Consequently, an increasing number of novel noninvasive models and nomograms, such as the IT [20], PAPAS [21], APRG [22], APGA [23], and Chen models, have also emerged for assessing fibrosis in CHB patients [24].
In this study, our objective was to evaluate the diagnostic accuracy of non-patented, noninvasive models, nomograms or indices based solely on routine serum biomarker data. These models are suitable for implementation in almost any medical facility and can effectively discriminate liver fibrosis and cirrhosis in both treatment-naïve individuals (including those with ALT < 2× the upper limit of normal [ULN]) and treated CHB patients who have discontinued antiviral therapy in China. The findings from this study may assist healthcare professionals in making appropriate decisions regarding antiviral treatment and potentially reduce the necessity for liver biopsy among CHB patients with normal or mildly elevated ALT levels through the utilization of a highly suitable model.

Patients
From January 2017 to December 2022, a total of 844 patients with CHB who had undergone liver biopsy at the Third Affiliated Hospital of Sun Yat-sen University were included in this study. The criteria for eligibility were as follows: (1) chronic HBV infection for over 6 months; (2) ALT and/or aspartate transaminase (AST) levels less than 2 times the ULN (ALT and AST ULN = 40 U/L) [25] at least 6 months before enrolling; (3) patients who had not previously received antiviral therapy, had received a transient course of antiviral therapy, or had stopped antiviral therapy for more than a year while HBV-DNA remained positive; (4) age of 16-70 years; and (5) the liver biopsy tissue met the immunohistochemical requirements with an available pathological report. A total of 245 patients were excluded based on the following criteria: (1) ALT and/or AST levels exceeding 2 times the ULN; (2) did not discontinue antiviral therapy prior to liver puncture; (3) liver disease of other etiologies, such as viral coinfection, autoimmune hepatitis (AIH), primary biliary cholangitis (PBC), primary sclerosing cholangitis (PSC), alcohol-associated liver disease (ALD), or nonalcoholic fatty liver disease (NAFLD); (4) liver cirrhosis (LC) or carcinoma; (5) aged > 70 years or younger than 16 years; (6) systemic diseases affecting the liver, such as HIV infection, heart failure, or hyperthyroidism; and (7) insufficient data availability.
A total of 599 patients who met the eligibility criteria were included in the final study cohort. Among them, 514 patients did not receive any antiviral treatment, while 85 patients had received nucleoside analogs (NAs) and/or peg-IFN antiviral therapy but had discontinued it for more than 1 year before liver puncture. Additionally, out of these patients, 221 were HBeAg positive, 377 were HBeAg negative, and one patient had missing HBeAg data. The detailed study design is illustrated in Figure 1.
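The eligibility rules above can be sketched as a simple filter. This is only an illustrative sketch, not the procedure used in the study: the `Patient` record and its field names are hypothetical, only a subset of the listed criteria is encoded (a transient course of therapy, systemic diseases, and data completeness are not modelled), and the treatment of values exactly at 2 × ULN is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

ULN = 40.0  # U/L; ULN for both ALT and AST as stated in the text


@dataclass
class Patient:
    """Hypothetical patient record; field names are illustrative only."""
    age: int
    alt: float                          # U/L
    ast: float                          # U/L
    on_antiviral_at_biopsy: bool
    years_off_therapy: Optional[float]  # None = never treated
    other_liver_disease: bool           # coinfection, AIH, PBC, PSC, ALD, NAFLD
    cirrhosis_or_carcinoma: bool


def is_eligible(p: Patient) -> bool:
    """Apply a subset of the stated inclusion/exclusion criteria."""
    if not 16 <= p.age <= 70:
        return False
    # Assumption: a value of exactly 2 x ULN is kept, since the exclusion
    # criterion refers to values "exceeding" 2 x ULN.
    if p.alt > 2 * ULN or p.ast > 2 * ULN:
        return False
    if p.on_antiviral_at_biopsy:
        return False
    # Previously treated patients must have been off therapy for > 1 year.
    if p.years_off_therapy is not None and p.years_off_therapy <= 1:
        return False
    if p.other_liver_disease or p.cirrhosis_or_carcinoma:
        return False
    return True


# Cohort arithmetic reported in the text.
assert 844 - 245 == 599      # biopsied patients minus exclusions
assert 514 + 85 == 599       # untreated plus previously treated
assert 221 + 377 + 1 == 599  # HBeAg positive, negative, missing
```

The three assertions at the end simply confirm that the reported subgroup counts add up to the final cohort size.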
The clinical data of eligible patients were retrospectively collected either on the day of liver biopsy or one week prior. This comprehensive dataset included demographic information, antiviral therapy history, and routine serum marker data commonly used in medical facilities. These markers encompassed the liver function indicators (such as ALB, GLO, TBIL, DBIL, ALT, AST, GGT, and ALP); coagulation function markers (e.g., PT, PTA, and INR); total cholesterol levels; apolipoprotein A1 levels; glucose levels; complete blood count values; and serological markers for HBV infection, including HBsAg, HBeAg, anti-HBe, anti-HBc, HBV DNA load and AFP levels. All of the above biomarkers were assessed at the Clinical Laboratory of the Third Affiliated Hospital of Sun Yat-sen University. Additionally, imaging examinations such as ultrasound scans or CT/MR scans were performed in the Radiology and Ultrasound Department to exclude cirrhosis and hepatocellular carcinoma (HCC).

Liver Histological Examination
Liver biopsies were performed via the percutaneous echo-assisted technique, with a minimum requirement of 6 portal tracts. The slides were examined and interpreted by two pathologists at the Third Affiliated Hospital of Sun Yat-sen University. Biopsies were categorized into stages based on the Scheuer scoring system [26]: G 0-4 and S 0-4. In this study, significant fibrosis (SF) and advanced fibrosis (AF) were defined as pathological stages ≥S2 and ≥S3, respectively. Significant inflammation and severe inflammation were defined as pathological stages ≥G2 and ≥G3, respectively.
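The four binary endpoints defined above can be derived mechanically from a Scheuer grade and stage. The following is a minimal sketch; the function name and dictionary keys are hypothetical and chosen only for illustration.

```python
def histology_endpoints(g: int, s: int) -> dict:
    """Map a Scheuer inflammation grade (G 0-4) and fibrosis stage (S 0-4)
    to the binary endpoints defined in the text."""
    if not (0 <= g <= 4 and 0 <= s <= 4):
        raise ValueError("Scheuer G and S must each lie between 0 and 4")
    return {
        "significant_fibrosis": s >= 2,      # SF: stage >= S2
        "advanced_fibrosis": s >= 3,         # AF: stage >= S3
        "significant_inflammation": g >= 2,  # grade >= G2
        "severe_inflammation": g >= 3,       # grade >= G3
    }


# A biopsy read as G2 S3 is positive for SF, AF, and significant
# inflammation, but not for severe inflammation.
print(histology_endpoints(g=2, s=3))
```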

Noninvasive Models
A literature search was conducted in the PubMed, Chinese National Knowledge Infrastructure (CNKI), and Web of Science databases and the relevant literature citations from January 2000 to December 2022 were noted to identify non-patented liver fibrosis models, nomograms, and indices utilizing serum markers. These models have been previously used for assessing fibrosis in patients with viral hepatitis B or C, PBC, or NAFLD, among others. Initially, more than ninety noninvasive models were collected; however, only 55 were included after excluding models that lacked important indices, such as haptoglobin, hyaluronic acid (HA), matrix metalloproteinase-1 (MMP-1), procollagen III N-terminal peptide (PIIINP), soluble CD163 (sCD163), and α2-macroglobulin (α2-MG); the quantification of HBeAg and anti-HBc antibodies; and the liver stiffness measure (LSM) by transient elastography [24,27–31]. The included models are summarized in Table S1.

Statistical Analysis
The data were analyzed using SPSS software version 20.0 (IBM Corp, Armonk, NY, USA) and GraphPad Prism Software version 8.0 (GraphPad Software). The one-sample Kolmogorov-Smirnov test was used for a normality analysis, and the data are presented as mean values ± standard deviations or median values with interquartile ranges (P25, P75). Student's t test and the Mann-Whitney U test were used for group comparisons. Categorical variables were expressed as numbers and percentages and were compared using the chi-square test. Areas under the receiver operating characteristic curve (AUROCs) with 95% confidence intervals (CIs) were utilized to evaluate the performance of the noninvasive models in diagnosing liver inflammation and fibrosis. Correlation analysis between histopathological fibrosis stage and liver inflammation was conducted using either Pearson's or Spearman rank test based on the normality of the distribution of the parameter in question. Optimal cutoffs were determined by maximizing Youden's index. The diagnostic metrics included sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), and negative likelihood ratio (NLR). A p value less than 0.05 (2-tailed) indicated statistical significance.
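To make the ROC-based quantities above concrete, the sketch below computes an AUROC, a Youden-optimal cutoff, and the listed diagnostic metrics in plain Python. It is an illustration under stated assumptions rather than a reproduction of the SPSS analysis: the AUROC uses the pairwise (Mann-Whitney) estimate, a case is called positive when its score is at or above the cutoff, both classes must be present, and the four data points are invented toy values.

```python
def auroc(scores, labels):
    """Pairwise (Mann-Whitney) AUROC estimate; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(scores, labels, cutoff):
    """Return (tp, fp, tn, fn), calling a case positive if score >= cutoff."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    return tp, fp, tn, fn


def youden_cutoff(scores, labels):
    """Cutoff maximizing J = sensitivity + specificity - 1 (lowest on ties)."""
    best_j, best_c = -1.0, None
    for c in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, c)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_c


def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV, NPV, PLR, and NLR at a given cutoff."""
    tp, fp, tn, fn = confusion(scores, labels, cutoff)
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PPV": tp / (tp + fp) if tp + fp else float("nan"),
        "NPV": tn / (tn + fn) if tn + fn else float("nan"),
        "PLR": sens / (1 - spec) if spec < 1 else float("inf"),
        "NLR": (1 - sens) / spec if spec > 0 else float("inf"),
    }


scores = [0.10, 0.40, 0.35, 0.80]  # toy model outputs (invented)
labels = [0, 0, 1, 1]              # 1 = histological endpoint present
print(auroc(scores, labels))       # 0.75
cutoff = youden_cutoff(scores, labels)
print(cutoff)                      # 0.35
print(diagnostic_metrics(scores, labels, cutoff))
```

Confidence intervals for the AUROC are not computed here; they would require an additional method such as bootstrapping.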
To comprehensively compare the accuracy of the noninvasive models, we adopted a grading system derived from Dong's study [32] (Table 1) and used it to assess both fibrosis and inflammation. A model with an AUROC below 0.700 scored 0, 0.700~0.750 scored 1, 0.750~0.800 scored 2, and >0.800 scored 3. The final score for each model was determined by summing the scores for S ≥ 2, S ≥ 3, G ≥ 2, and G ≥ 3. Subsequently, a model scoring 0-4 was designated Grade C, representing low diagnostic efficiency; a model scoring 5-8 was designated Grade B, representing moderate diagnostic efficiency; and a model scoring 9-12 was designated Grade A, representing high diagnostic efficiency. The grading system was also applied to validate liver inflammation and liver fibrosis based on the HBeAg status or ALT levels.
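The grading rule described above can be written out as a short function. This is a sketch under one assumption that the text does not settle: values falling exactly on a shared boundary (0.700 or 0.750) are assigned to the higher band, while exactly 0.800 stays in the 2-point band because 3 points require an AUROC above 0.800. The function names are hypothetical; the AUROCs in the usage lines are those reported for all patients in this article.

```python
def auroc_points(auroc: float) -> int:
    """Points for one endpoint (boundary handling is an assumption)."""
    if auroc < 0.700:
        return 0
    if auroc < 0.750:
        return 1
    if auroc <= 0.800:
        return 2
    return 3


def grade_model(aurocs):
    """Sum the points over the four endpoints (S>=2, S>=3, G>=2, G>=3)
    and convert the total (0-12) to Grade A, B, or C."""
    if len(aurocs) != 4:
        raise ValueError("expected AUROCs for S>=2, S>=3, G>=2, G>=3")
    total = sum(auroc_points(a) for a in aurocs)
    if total >= 9:
        return total, "A"
    if total >= 5:
        return total, "B"
    return total, "C"


# AUROCs for all patients, in the order S>=2, S>=3, G>=2, G>=3.
print(grade_model([0.750, 0.832, 0.779, 0.874]))  # APGA -> (10, 'A')
print(grade_model([0.719, 0.802, 0.705, 0.834]))  # GAPI -> (8, 'B')
```

Under the alternative boundary reading (0.750 scoring 1 point), APGA would total 9 and still fall in Grade A.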

Table 1. Grading system based on AUROCs derived from the data of this study for the evaluation of noninvasive models.

Patient Characteristics
The study design is illustrated in Figure 1. From January 2017 to December 2022, a total of 844 CHB patients who had undergone liver biopsy were recruited for this study. After applying the eligibility criteria, a cohort of 599 patients was enrolled as study subjects for subsequent analysis. There were 514 patients who were not treated with any NAs or peg-IFN, while 85 patients were treated with NAs and/or peg-IFN but had discontinued them for more than 1 year before liver puncture. According to the Scheuer scoring system, S ≥ 2 and/or G ≥ 2 were defined as significant pathological changes in liver injury (SHCHI), and we divided patients into two groups. In (3.675, 6.11), respectively (p = 0.773); moreover, there were no significant differences between these two groups. The comprehensive demographic and laboratory parameters of the subjects are presented in Table 2.

Validation of Noninvasive Models in All Patients
Due to the limited sample size in the S4 group, potential bias may exist. Therefore, a comparison of models was conducted between patients with and without SF (S ≥ 2) and AF (S ≥ 3) across the entire population. The AUROCs for each model for discriminating between SF and AF were calculated and are summarized in Table 3.
In general, the 55 noninvasive models were found to be effective at diagnosing both SF and AF among all patients, regardless of their HBeAg status or AST/ALT levels. However, not all the models demonstrated satisfactory performance. In the discrimination of SF among all patients, four models exhibited AUROCs higher than 0.700: the APGA (0.750), APPCI (0.726), GAPI (0.719), and Xie models (0.714). For identifying AF, the AUROC of the APGA model (0.832) exceeded 0.800, while the three models with the next highest AUROCs were GAPI (0.802), the Xie model (0.80), and the S index (0.797). Overall, each noninvasive model generally displayed a greater AUROC for diagnosing AF than for diagnosing SF, as presented in Table 3 and Figure 3.
Most noninvasive models have been utilized for discriminating liver fibrosis stages but have limited application in validating inflammation. Hence, our study aimed to validate the efficacy of these models in distinguishing liver inflammation. To compare the performance of different models between patients with and without G ≥ 2 and G ≥ 3, we calculated the AUROCs for each model; these values are summarized in Table 3 and Figure 3.
The four models with the highest AUROCs for identifying G ≥ 2 in all patients were APGA (0.779), the Xie model (0.744), AGAP (0.733), and Logit(Y) (0.725); the AUROC of GAPI was 0.705. For discriminating G ≥ 3, the four models with the highest AUROCs were APGA (0.874), the Xie model (0.861), Wang I (0.859), and AGAP (0.857); the AUROC of GAPI was 0.834. Generally, there was a gradual increase in the AUROC of each noninvasive model for diagnosing inflammation as the inflammation grade increased, as shown in Table 3 and Figure 3.

Evaluation and Comparison of Noninvasive Models in the HBeAg-Negative and HBeAg-Positive Groups
The patients were stratified into HBeAg-positive and HBeAg-negative groups for comparative analysis and further validation. In general, the noninvasive models exhibited higher AUROCs in the HBeAg-positive group than in the HBeAg-negative group for discriminating each fibrosis stage, indicating that these models may be more appropriate for managing HBeAg-positive patients.

Reassessment and Comparison of Noninvasive Models in Patients with Varying Levels of ALT below Two Times the ULN (ULN = 40 U/L)
Initially, we excluded patients whose ALT or AST levels exceeded 80 U/L. Subsequently, we assessed the diagnostic performances of the 55 noninvasive models in CHB patients with varying ALT levels below two times the ULN (ULN = 40 U/L). Among these models, the APGA, AGAP, GAPI, S-index, and XIE models exhibited significant differences in their ability to distinguish liver fibrosis and necroinflammation stages when ALT was below the ULN. Additionally, APGA, FIB-6, FI, and RPR demonstrated potential for a good performance when the ALT concentration exceeded the ULN (Tables S2 and S3).
Finally, we evaluated 29 models by excluding those with parameters that could not be obtained from routine laboratory tests (Table S1). Other than APGA, there were other models with excellent predictive value for SF and AF, with AUROCs exceeding 0.700. Among these, the four best-performing models were GAPI (0.719, 0.802), the XIE model (0.714, 0.8), AGAP (0.713, 0.794), and the S-index (0.708, 0.797). The performance of other models that were developed based on chronic hepatitis C, PBC or NAFLD cohorts varied in our study. However, GUCI, Fibro-α, FI, APRI, and King's score performed better in our cohort, especially for AF patients, with AUROCs higher than 0.700. The AUROCs of the CHB cohort-based models, such as the APGA, GAPI, and XIE models, were greater than those of the recommended models, such as the APRI and FIB-4, for single patients, as shown in Figure 4.

Comprehensive Evaluation of Noninvasive Models
Given the variability of the noninvasive models with superior performances across different fibrosis stages, it is challenging to identify one or a few models that are unequivocally superior. To evaluate the noninvasive models, we adopted the scoring system developed by Dong [32] (Table 1). According to this grading system, only APGA was classified as grade A with high diagnostic value for discriminating both liver fibrosis stage and liver necroinflammation in all patients. Additionally, the other two noninvasive models, GAPI and the XIE model, also exhibited improved performance (Table 4 and Figure 4).

A Spearman correlation analysis (Table 5) was further employed to evaluate the associations between the serum marker levels, liver inflammation grade, and fibrosis stage. Both inflammation and liver fibrosis stages were found to be significantly correlated with the common parameters included in the noninvasive models or indices. Notably, among the three highly effective noninvasive models, the APGA index exhibited remarkably positive correlations with liver inflammation and fibrosis, with Spearman's scores of 0.452 and 0.405, respectively. The diagnostic performance of the top three models in our study for liver fibrosis and inflammation prediction is presented in Table 6, along with the corresponding cutoff values, sensitivity, specificity, NLR, NPV, PLR, and PPV. Additionally, the APGA index demonstrated a consistently favorable performance.
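For readers unfamiliar with the statistic, the sketch below shows how a Spearman coefficient between a continuous model score and an ordinal histological stage can be computed: both variables are converted to ranks (tied values share their average rank), and the Pearson correlation of the ranks is taken. The function names and toy data are invented for illustration, and no p value is computed.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


model_score = [1.0, 2.0, 3.0, 4.0]  # toy continuous index values
stage = [0, 1, 1, 3]                # toy ordinal stages with one tie
print(round(spearman_rho(model_score, stage), 3))  # 0.949
```

Because histological stages take only five values, ties are frequent, which is why the rank-averaging step matters in this setting.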

Discussion
HBV infection remains a global public health challenge, and cirrhosis and HCC result in high morbidity and mortality rates. This issue has also had significant economic and societal impacts [34]. Recently, the indications for antiviral therapy have been expanded to include more chronic HBV patients, who require longer treatment courses before they meet the criteria for discontinuation. However, prolonged treatment can lead to concerns, such as asymptomatic patients starting medication without taking their illness seriously or adhering poorly to medication regimens, potentially leading to adverse drug reactions, drug resistance, high recurrence rates of hepatitis, serious liver failure requiring hospitalization or transplantation, or even death [35]. Therefore, it is important to assess the liver inflammation and fibrosis stage prior to initiating antiviral treatments to improve awareness of the disease and avoid these issues while benefiting patients.
In addition to liver biopsy data and recommended models, such as FIB-4 and the APRI index according to various guidelines, an increasing number of models based on CHB cohorts are being developed using both common clinical inspection indices and innovative indices, including MMP-1, PIIINP, sCD163, α2-MG, quantitative HBeAg and anti-HBc, as well as TE, 2D-SWE and the FibroTest. Models incorporating innovative indices consistently demonstrate a superior performance in distinguishing different stages of liver fibrosis. However, the implementation of new serum indices often requires additional detection methods that may be costly and operator dependent. Moreover, the markers utilized in these prediction models may not be readily available in routine non-research laboratories in China. Therefore, this study focused on evaluating models solely based on commonly used and easily obtainable serum clinical indices that have been previously employed for predicting liver fibrosis. Particular emphasis is placed on models derived from HBV cohorts.
In the present study, we included 55 models obtained through comprehensive searches of PubMed, CNKI, and other databases. The formulas or nomograms are presented in Table S1. The majority of these models were developed based on routine laboratory parameters, including AST, AFP, ALP, ALB, GGT, PLT, INR and PTA; however, some models have incorporated quantitative measurements of HBsAg and HBV DNA.
Among the models used to predict liver fibrosis stages, a noninvasive predictive model called APGA, which was classified as grade A in our study (Table 4), was developed by James Fung [23]. The APGA model utilizes AST, platelet count, GGT, and AFP to predict both severe fibrosis and cirrhosis. In the original study, this model achieved an AUROC of 0.85. After conducting our own analysis on our cohort, we found that the model had high diagnostic value, with AUROCs of 0.750 and 0.832 for SF and AF, respectively. However, liver fibrosis and cirrhosis were measured using TE in previous studies, with cutoff values >8.1 kPa and >10.3 kPa, respectively. This may limit the correlation of the model results with actual liver histology. Seto [21] evaluated this model in another Chinese CHB patient cohort and obtained results consistent with ours regarding the accuracy of SF prediction. However, Erdogan et al. [36] evaluated the diagnostic value of this model in a Turkish cohort without achieving comparable results. When we divided patients into two groups based on HBeAg status and ALT levels, the APGA model demonstrated exceptional performance in assessing fibrosis levels across all populations as well as among patients with different HBeAg states or ALT levels. Furthermore, the model results were positively correlated with liver histology, as shown in Table 5. Additionally, compared to the other models assessed in this study, APGA exhibited better overall performance stability when predicting liver fibrosis across various populations while maintaining relatively higher sensitivity, specificity, and NPV, as presented in Table 6.
We further discussed and explored the discriminative value of noninvasive models for liver inflammation. Regardless of the HBeAg state and ALT level, APGA also exhibited strong discriminatory ability for G ≥ 2 and G ≥ 3, with AUROCs of 0.779 and 0.874, respectively. In addition to this model, another logistic regression model (LRM, referred to as the XIE model in this article) [37] was developed to predict liver necroinflammation in patients with hepatitis B e antigen negative CHB with normal or minimally elevated ALT levels. The XIE model achieved similar results, with AUROCs of 0.744 and 0.861 in our study (Table 3). The AUROC for predicting inflammation grade was greater in the HBeAg-negative group than in the HBeAg-positive group, primarily because this model was initially developed in HBeAg-negative patients. However, there is limited research available on the application of this noninvasive model for predicting liver fibrosis.
In this article, we conducted a comprehensive search for noninvasive models developed on CHB cohorts and ultimately evaluated 29 models (Table S1) after excluding those that required parameters not obtainable through routine laboratory tests. Among all of the patients, we identified several models in addition to the APGA model mentioned above that exhibited excellent predictive value for liver fibrosis and inflammation, with AUROCs exceeding 0.700. Notably, GAPI and the XIE model (LRM) [37] were the two top-performing models according to our analysis (as shown in the Tables). Our findings were consistent with previous studies demonstrating that GAPI [38], a novel fibrosis index utilizing γ-glutamyl transpeptidase (GGT), age, platelet count, and international normalized ratio (INR), exhibited a superior performance in predicting SF and AF among CHB patients based on higher AUROC values obtained from our cohorts. In addition to these outstanding models, several other models have been developed specifically for CHB patients, such as the AA index [39], mFIB-4 score [40], IT model [20], and PNALT; however, their AUROCs were below 0.700, indicating a poorer predictive performance.
The other models based on CHC, PBC, or NAFLD patients exhibited disparate performances in the present study. Notably, the models Fibro-α [41], FIB-6 [42], Virahep-C [43], APRI [16], FCI [44], GUCI [45], and Forns [46] demonstrated a superior performance in our cohort, particularly for AF, with AUROCs exceeding 0.700. Conversely, the HGM-2 [47], FIB-5 [48], RLR [49], and NLR models exhibited poor performance, with AUROCs of less than 0.500. In the case of HGM-2, which incorporates PLT count, INR, ALP, and AST as predictors, the AUROC for predicting F ≥ 3 in the HCV/HIV infection group was reported to be 0.844 and 0.815 for the estimation and validation groups, respectively, in the original article, indicating a remarkable ability to predict AF [47]. However, this result is significantly greater than our findings among CHB patients, suggesting that this model developed based on CHC cohorts may not be applicable to CHB cohorts. Another study applied HGM-2 in HCV patients and obtained favorable results, with AUROCs of 0.809, 0.898 and 0.909 for SF, AF, and cirrhosis, respectively [50]. Similarly, AUROCs of 0.843 and 0.917 for AF and cirrhosis, respectively, were reported for another HCV cohort [51]. Prior to our study, no research had utilized this model to predict liver fibrosis stage in CHB patients, further supporting its potential unsuitability for such cohorts.
The APRI and FIB-4 scores have been highly regarded and extensively discussed in various major guidelines for the management of CHB. However, their diagnostic accuracies have not yielded consistent results across reported studies. Based on the data from this study, both APRI and FIB-4 demonstrated moderate diagnostic accuracy. When we attempted to utilize these models for distinguishing between different stages of liver inflammation, we observed that they exhibited superior discriminatory performance for liver inflammation compared to fibrosis, as evidenced by higher AUROC values (as shown in Table 3), which is consistent with previously reported findings in CHB patients [32,52,53]. In our study, although their performances were slightly inferior to other novel models for diagnosing fibrosis and inflammation (as presented in the tables) [31,54], when comparing the AUROC of the APGA model with these two models, APGA was found to outperform them in the evaluation of liver histology (Figure 4).
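For reference, the two guideline-recommended indices discussed above can be computed from routine values as sketched below. The formulas are the commonly cited published definitions of APRI and FIB-4; they are not reproduced in this article's main text, so Table S1 remains the authoritative source for the exact forms evaluated in the study. The function names and example values are illustrative, and the AST ULN of 40 U/L follows the value used in this study.

```python
import math

AST_ULN = 40.0  # U/L; the ULN used in this study


def apri(ast: float, platelets: float, ast_uln: float = AST_ULN) -> float:
    """APRI = (AST / ULN of AST) / platelet count (10^9/L) x 100."""
    return (ast / ast_uln) / platelets * 100


def fib4(age: float, ast: float, alt: float, platelets: float) -> float:
    """FIB-4 = age (years) x AST (U/L) / (platelets (10^9/L) x sqrt(ALT (U/L)))."""
    return age * ast / (platelets * math.sqrt(alt))


# Invented example: 50-year-old, AST 40 U/L, ALT 25 U/L, platelets 200 x 10^9/L.
print(apri(ast=40, platelets=200))                  # 0.5
print(fib4(age=50, ast=40, alt=25, platelets=200))  # 2.0
```

Both indices rise as AST increases and as the platelet count falls, which is why they are sensitive to transaminase levels as well as to fibrosis.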
Despite the relatively large cohort, it is important to acknowledge the limitations of this study, including its single-center and retrospective nature. Inevitably, there may be bias resulting from missing data and selection bias, particularly admission bias. The differences between our findings and previously reported results could be attributed to variations in sample size, liver histology validation methods, upper limits of detection for quantifying HBsAg and HBV DNA, and differences in study subjects; all of these factors have the potential to impact the outcomes. Furthermore, we observed that the APGA model exhibited a superior and consistent ability to distinguish liver fibrosis and necroinflammation; however, further verification is still needed in other cohorts.
In conclusion, the results of the present study conducted in a large Chinese CHB cohort suggested that among the 55 noninvasive models for staging liver fibrosis, the APGA model demonstrated superior accuracy in staging both liver fibrosis and inflammation. Generally, models developed based on CHB cohorts outperform those developed on other chronic liver disease patients. The predictive value of these models may be influenced by HBeAg status and ALT levels; however, the APGA model consistently outperformed the other models in predicting liver fibrosis and inflammation. This model has the potential to reduce the reliance on liver biopsy for antiviral treatment guidance and significantly improve patient compliance. Although most models showed a satisfactory performance in our study, certain biases hinder their practical application. Therefore, further investigations are needed to develop innovative, noninvasive models with enhanced practicability for assessing liver staging and dynamic monitoring.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics14050456/s1, Table S1. Calculations of the 55 noninvasive models; Table S2. Area under the ROC curve (AUROC) of models for liver fibrosis and necroinflammation with different levels of ALT under 2 ULN (ULN = 40 U/L); Table S3. Grade distribution of the 55 noninvasive models for diagnosing liver fibrosis and necroinflammation under different ALT.

Figure 2. The incidences of liver fibrosis and inflammation in all patients and in the different groups ((A) liver fibrosis; (B) liver inflammation).

Table 3. Area under the ROC curve (AUROC) of the models for fibrosis stage and necroinflammation in the total population and different groups.

Figure 3. Of the 55 studied models, 10 models exhibited the best performance for evaluating liver fibrosis (A) and inflammation (B) in all patients.

Author Contributions: S.M., L.Z. and S.L. collected the data. S.M. and M.L. interpreted the data. S.M. and J.L. drafted the manuscript, and L.C. revised it. L.C. and S.M. designed this project and mapped the structure. All authors have read and agreed to the published version of the manuscript.

Funding: This study was supported by the General Planned Project of Guangzhou Science and Technology (grant number 202102010195).

Institutional Review Board Statement: This retrospective chart review study involving human participants was performed in accordance with the ethical standards of the institutional and national research committees and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The Human Investigation Committee (IRB) of the Third Affiliated Hospital of Sun Yat-sen University approved this study (II2023-061-01, 29 March 2023).

Informed Consent Statement: Patient consent was waived due to the retrospective nature of this study.

Table 1. Grading system based on AUROCs derived from the data of this study for the evaluation of noninvasive models.

Table 2. Baseline characteristics of the enrolled patients.

Table 4. Grade distribution of the 55 noninvasive models for diagnosing liver fibrosis and necroinflammation.

Note: The final score of a model was the sum of the four scores. A model scoring 0-4 was designated Grade C, representing low diagnostic efficiency; a model scoring 5-8 was designated Grade B, representing moderate diagnostic efficiency; and a model scoring 9-12 was designated Grade A, representing high diagnostic efficiency.
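The grading rule in the note above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function name `grade_model` is hypothetical, and the four component scores are taken as given inputs because the rule that assigns each of them from an AUROC is defined in Table 1 and is not reproduced here.

```python
def grade_model(component_scores):
    """Map the four component scores of one model to Grade A, B, or C.

    Thresholds follow the note to Table 4: total 0-4 -> C, 5-8 -> B, 9-12 -> A.
    How each component score is derived from an AUROC is assumed to be handled
    elsewhere (Table 1); this function only sums and grades.
    """
    if len(component_scores) != 4:
        raise ValueError("expected exactly four component scores")
    total = sum(component_scores)
    if not 0 <= total <= 12:
        raise ValueError("total score must lie between 0 and 12")
    if total >= 9:
        return "A"  # high diagnostic efficiency
    if total >= 5:
        return "B"  # moderate diagnostic efficiency
    return "C"      # low diagnostic efficiency


# Hypothetical component scores for three models
print(grade_model([3, 3, 2, 3]))
print(grade_model([2, 2, 2, 2]))
print(grade_model([1, 0, 1, 1]))
```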

Table 5. Spearman correlations between the noninvasive models or indices and liver histology scores.

Table 6. Diagnostic performance of the top three models in predicting liver fibrosis and inflammation across all populations.